Using a publicly available dataset from kaggle.com, this research compares the performance of six well-known machine-learning approaches for predicting heart failure: Decision Tree, Random Forest (RF), Tree Ensemble, Gradient Boosted Trees (GBT), Naive Bayes, and Logistic Regression. Heart failure is a major public health problem, and improving the treatment of heart disease patients is necessary to increase the survival rate. Accuracy was used to assess the performance of the machine learning methods. RF produced the highest accuracy, 80%, compared with the Decision Tree classifier, Tree Ensemble, GBT, Naive Bayes, and Logistic Regression.
I. INTRODUCTION
The heart is a muscular organ that pumps blood through the body; this process is known as circulation. Together, the heart, the blood vessels, and the blood form the cardiovascular system.
Heart conditions are the leading cause of death worldwide. According to the World Health Organization (WHO), cardiovascular diseases caused an estimated 17.9 million deaths in 2019.
Machine learning algorithms can be used to predict heart disease, as well as many other disorders, in the medical field. Early prediction can help save lives and makes it easier to deliver successful treatment.
Symptoms of heart disease: coronary artery disease is a common heart condition, and its symptoms can differ considerably between men and women. They include chest pain, chest tightness, chest pressure, shortness of breath, pain in the upper belly area, and weakness or coldness in the legs or arms. Heart disease can also be caused by irregular heartbeats.
II. LITERATURE SURVEY
Avinash Golande and colleagues [1] studied different ML techniques for predicting heart disease, namely Decision Tree, KNN, and K-Means. Decision Tree achieved the highest accuracy of these algorithms.
B. Gomathy et al. [2] suggested a technique based on data mining methods. In their results, the 45% accuracy obtained on the testing set was lower than that of a typical fuzzy artificial neural network.
Fahd Saleh Alotaibi [3] studied different machine learning algorithms: Decision Tree, Logistic Regression, Random Forest, Naive Bayes, and SVM. The Decision Tree algorithm had the highest accuracy.
Umair Shafique et al. [4] employed data mining methods with Decision Tree, Naive Bayes, and Neural Network algorithms, and obtained an accuracy of 82% for Naive Bayes and 78% for the Decision Tree.
Sabarinathan Vachiravel et al. [5] suggested a decision tree method to predict heart disease and obtained 85% accuracy.
N. Komal Kumar, G. Sarika Sindhu et al. [6] used Random Forest, Logistic Regression, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), and obtained an accuracy of 85% for Random Forest, 74% for Logistic Regression, 77% for SVM, and 68% for KNN.
Malkari Bhargav et al. [7] used ANN, a regression technique, Random Forest, Decision Tree, SVM, and KNN, and obtained an accuracy of 96% for ANN, 88% for the regression technique, 83% for SVM, and 68% for KNN.
Gayatri Ramamoorthy et al. [8] evaluated several machine learning algorithms, namely KNN, Naïve Bayes, and SVM, and obtained an accuracy of 80% for KNN, 65% for SVM, and 80% for Naïve Bayes.
Apurb Rajdhan et al. [9] applied Decision Tree, Logistic Regression, and Naïve Bayes, and obtained an accuracy of 81% for Decision Tree, 85% for Logistic Regression, and 85% for Naïve Bayes.
III. PROPOSED METHODOLOGY
A. Data Collection
The gathered dataset contains 700 case records and 11 attributes. A dataset is the body of information required to carry out any type of research or design. We obtained the data from the dataset provider Kaggle.com. The dataset, titled Heart Failure Prediction Dataset, was published by Fedesoriano in September 2021 [10] and was retrieved from https://www.kaggle.com/fedesoriano/heart-failure-prediction.
B. About Dataset
Attribute Information
Chest Pain: ATA (Atypical Angina: chest pain not related to the heart), NAP (Non-Anginal Pain: non-heart-related pain), ASY (Asymptomatic: chest pain not showing signs of disease)
ST_Slope: Up (exercising raises the heart rate; uncommon), Flat (hardly any change), Down (signs of a heart illness)
HeartDisease: 1 = heart disease, 0 = normal
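For readers who want to work with these attributes outside KNIME, the categorical codes can be turned into numeric indicator columns. The following is a minimal sketch, not part of the KNIME workflow described in this paper. The dictionary keys `ChestPainType` and `ST_Slope`, the capitalisation of the codes, and the example record are assumptions made for illustration.

```python
# Category codes taken from the attribute description; capitalisation is assumed.
CHEST_PAIN_CODES = ["ATA", "NAP", "ASY"]
ST_SLOPE_CODES = ["Up", "Flat", "Down"]


def one_hot(value, codes):
    """Return a 0/1 indicator list with a 1 at the position of `value`."""
    if value not in codes:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == code else 0 for code in codes]


def encode_record(record):
    """Turn one record (a dict) into a numeric feature list plus its label."""
    features = one_hot(record["ChestPainType"], CHEST_PAIN_CODES)
    features += one_hot(record["ST_Slope"], ST_SLOPE_CODES)
    return features, int(record["HeartDisease"])


# Hypothetical record, for illustration only (not taken from the dataset).
example = {"ChestPainType": "NAP", "ST_Slope": "Flat", "HeartDisease": 1}
features, label = encode_record(example)
```

Here the first three positions of `features` describe the chest pain type and the last three the ST slope, so each record becomes a fixed-length numeric vector.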
C. About KNIME
KNIME lets users build data workflows that run some or all of the analysis steps and then evaluate the results and models through interactive widgets and views.
Steps for creating a KNIME workflow
Drag the CSV Reader node into the KNIME workflow, then drag and drop the .csv file onto the CSV Reader node.
Partition Node: Make a training set out of 70% of the data rows and a test set out of the remaining 30%.
The decision tree has a learner node to train the model on the training set. This algorithm also has a predictor node to apply the model to other input data. In the test phase, the predictor node is used to apply the trained model to the test data.
The scorer node compares the original classes with the predicted classes in the test records and measures model performance based on the numbers of true positives, false positives, true negatives, and false negatives.
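The partitioning and scoring steps above can be sketched in plain Python. This is an illustrative sketch only, not KNIME's implementation: the function names `partition` and `score`, the fixed random seed, and the toy records and predictions are all assumptions introduced here.

```python
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows and split them into (train, test) by train_fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def score(actual, predicted, positive=1):
    """Count TP, FP, TN, FN and derive accuracy from them."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn, "accuracy": accuracy}


# Hypothetical toy data: 10 records with a 0/1 HeartDisease label.
records = [{"id": i, "HeartDisease": i % 2} for i in range(10)]
train, test = partition(records)  # 70% training rows, 30% test rows

# Hypothetical true and predicted classes, for illustration only.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
result = score(actual, predicted)
```

Accuracy is the share of test records whose predicted class matches the original class, that is, (TP + TN) divided by the number of test records.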
D. Applying Algorithms
We compare six machine learning algorithms: Decision Tree, Random Forest, Tree Ensemble, Gradient Boosted Trees (GBT), Naive Bayes, and Logistic Regression.
1) Decision Tree Classifier
Decision trees are applicable to two different data mining tasks, namely classification and prediction. A decision tree defines its rules visually, which makes them simple to interpret and understand. Pre-processing in this approach consists of separating the data into training and test sets. This algorithm achieves 65% accuracy. Figure 1 shows the Decision Tree classifier workflow in KNIME, and Figure 2 shows the resulting decision tree.
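To illustrate how a decision tree arrives at a rule, the sketch below searches for the single threshold on one numeric feature that best separates the two classes, using Gini impurity as the quality measure. It is a simplified illustration under stated assumptions, not the KNIME Decision Tree Learner: the function names, the choice of Gini impurity, and the toy values are assumptions, and a full tree would repeat this search recursively over all features.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Find the threshold on one numeric feature that minimises weighted Gini."""
    best_threshold, best_impurity = None, float("inf")
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must leave records on both sides
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical numeric feature values and 0/1 labels, for illustration only.
feature_values = [35, 42, 50, 58, 63, 70]
labels = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(feature_values, labels)
```

On this toy data the chosen rule reads "value <= 50 predicts class 0, otherwise class 1", which is the kind of human-readable rule a decision tree view displays.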
V. ACKNOWLEDGMENT
Firstly, I'd like to express my gratitude to the Computer Science and Engineering Department of the Rajeev Institute of Technology, as well as to Dr. Arjun B.C., Head of the Department of Information Science and Engineering. Dr. Prakash H.N., Head of the Department of CSE, Rajeev Institute of Technology, Hassan, deserves particular appreciation for his ongoing direction and assistance with our project work.
Conclusion
In this paper, we used six different machine learning techniques to predict heart failure. Among these, the Random Forest algorithm, which reached an accuracy of 80.4% in our results, performed best at predicting heart disease.
References
[1] Pavan Kumar T and Avinash Golande, "Heart Disease Prediction Using Efficient Machine Learning Methods," International Journal of Current Technology and Engineering, vol. 8, pp. 944-950, 2019.
[2] T. Nagamani, S. Logeswari, and B. Gomathy, "Heart Disease Prediction Using Data Mining and the MapReduce Algorithm," International Journal of Innovative Technology and Engineering, vol. 8, pp. 944-950, 2019.
[3] Fahd Saleh Alotaibi, "Implementation of Machine Learning Model to Predict Heart Failure Disorder," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 10, no. 6, 2019.
[4] Umair Shafique, Fiaz Majeed, Haseeb Qaiser, and Irfan ul Mustafa, "Data Mining in Healthcare for Cardiac Disorders," ISSN 2028-9324, vol. 10, no. 4, pp. 1312-1322, 2015.
[5] V. Sabarinathan, "Diagnosis of Heart Illness Using Decision Tree," International Journal of Research in Computer Applications and Information Technology, vol. 2, no. 6, pp. 74-79, 2014.
[6] Vikas Chaurasia, "Data Mining Technique to Detect Cardiac Diseases," International Journal of Advanced Computer Science and Information Technology (IJACSIT), vol. 2, no. 4, pp. 56-66, 2013, ISSN 2296-1739.
[7] Malkari Bhargav and J. Raghunath, "A Study on Risk Prediction of Cardiovascular Disease Using Machine Learning Algorithms," International Journal of Emerging Technologies and Creative Research (www.jstor.org), vol. 7, issue 8, pp. 683-688, August 2020.
[8] Gayathri Ramamoorthy, "Study of Heart Disease Prediction Using Various Machine Learning Algorithms," International Conference on Artificial Intelligence, Smart Grid, and Smart City Applications (AISGSC 2019).
[9] Apurb Rajdhan, "Heart Disease Prediction Using Machine Learning," ISSN 2278-0181, vol. 9, issue 04, April 2020.
[10] Federico Fedesoriano (September 2021). Heart Failure Prediction Dataset. Retrieved [date retrieved] from https://www.kaggle.com/fedesoriano/heart-failure-prediction.